The translanguage English database (TED)

Authors

  • Lori Lamel
  • Florian Schiel
  • Adrian Fourcin
  • Joseph-Jean Mariani
  • Hans G. Tillmann
Abstract

The Translanguage English Database is a corpus of recordings of oral presentations given at Eurospeech93 in Berlin. The corpus name derives from the high percentage of presentations given in English by non-native speakers of English. A total of 224 oral presentations at the conference were successfully recorded, providing about 75 hours of speech material. These recordings provide a relatively large number of speakers speaking a variant of the same language (English) for a relatively long time (15 min each + 5 min discussion) on a specific topic. A subset of speakers was recorded with a laryngograph in addition to the standard microphone. A set of Polyphone-like recordings was also made, for a subset of which a laryngograph signal was recorded as well. These recordings were made in English and in the speaker's mother language. In addition to the spoken material, associated text materials are being collected. These include written versions of the proceedings papers and any oral preparation texts that were made available. The text materials will provide vocabulary items and data for language modeling. Speakers were also asked to complete a short questionnaire regarding their mother language, any other languages they speak, and their knowledge of English.

Similar resources

The ISL Baseline Lecture Transcription System for the TED Corpus

This paper describes the Interactive Systems Laboratories’ automatic lecture transcription system for the Translanguage English Database (TED) corpus, which provides text hypotheses for the International Workshop on Speech Summarization for Information Extraction and Machine Translation. Furthermore, the paper gives a short analysis of speaking style characteristics, in particular addressing nat...

Spontaneous speech consolidation for spoken language applications

This paper describes the work done as a part of the International Workshop on Speech Summarization for Information Extraction and Machine Translation (IWSpS), on spoken language processing including summarization, machine translation and question answering on lecture speech in the Translanguage English Database (TED) corpus. The hypotheses of lecture speech obtained by automatic speech recogn...

Linguistic Model Adaptation for Speech Summarisation

In this paper we extend the work done on the two-stage summarisation method described in [1] by focusing on adapting the linguistic component to make it better suited to the summarisation task. In particular we examine methods for adapting the linguistic models (LiM) automatically to improve performance, using either unigram, bigram or trigram information from different sources of data. Experim...

Speaker dependent model order selection of spectral envelopes

This work introduces a maximum-likelihood based model order (MO) selection technique for spectral envelopes to apply speaker dependent adaptation in the feature space, similar to vocal tract length normalization. Speech recognition systems based on spectral envelopes use a fixed MO for the underlying linear parametric model. Using a fixed MO over different speakers or channels might not be...

Frame based model order selection of spectral envelopes

Spectral envelopes, using (warped or perceptual) linear prediction or minimum variance distortionless response for the underlying linear parametric model, are widely used in speech recognition systems where the frequency resolution, namely the model order (MO), of the spectrum is kept constant. Modeling different types of phonemes such as vowels or fricatives with the same frequency resolution ...

Publication date: 1994